Results 1 - 20 of 4,433
1.
Elife ; 13, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38568075

ABSTRACT

Learning invariances allows us to generalise. In the visual modality, invariant representations allow us to recognise objects despite translations or rotations in physical space. However, how we learn the invariances that allow us to generalise abstract patterns of sensory data ('concepts') is a longstanding puzzle. Here, we study how humans generalise relational patterns in stimulation sequences that are defined by either transitions on a nonspatial two-dimensional feature manifold, or by transitions in physical space. We measure rotational generalisation, i.e., the ability to recognise concepts even when their corresponding transition vectors are rotated. We find that humans naturally generalise to rotated exemplars when stimuli are defined in physical space, but not when they are defined as positions on a nonspatial feature manifold. However, if participants are first pre-trained to map auditory or visual features to spatial locations, then rotational generalisation becomes possible even in nonspatial domains. These results imply that space acts as a scaffold for learning more abstract conceptual invariances.


Subjects
Generalization, Psychological , Learning , Humans
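As a rough illustration of the rotational generalisation described in the abstract above, the sketch below rotates a sequence of two-dimensional transition vectors and checks that the turning angles between consecutive transitions are unchanged. The example concept, the function names, and the choice of turning angles as the rotation-invariant quantity are assumptions made for illustration; they are not taken from the study.

```python
import math

def rotate(vector, degrees):
    """Rotate a 2D transition vector counter-clockwise by `degrees`."""
    theta = math.radians(degrees)
    x, y = vector
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def rotate_concept(transitions, degrees):
    """Apply the same rotation to every transition vector of a concept."""
    return [rotate(v, degrees) for v in transitions]

def turning_angles(transitions):
    """Signed angles between consecutive transitions (invariant under rotation)."""
    angles = []
    for (x1, y1), (x2, y2) in zip(transitions, transitions[1:]):
        cross = x1 * y2 - y1 * x2
        dot = x1 * x2 + y1 * y2
        angles.append(math.atan2(cross, dot))
    return angles

# Hypothetical concept: three transitions on a two-dimensional manifold.
concept = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
rotated = rotate_concept(concept, 90)
```

The rotated exemplar has different transition vectors but the same relational pattern, which is the sense in which recognising it counts as generalisation.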
2.
PLoS One ; 19(4): e0296841, 2024.
Article in English | MEDLINE | ID: mdl-38568960

ABSTRACT

Recent research has shown that comparisons of multiple learning stimuli which are associated with the same novel noun favor taxonomic generalization of this noun. These findings contrast with single-stimulus learning in which children follow so-called lexical biases. However, little is known about the underlying search strategies. The present experiment provides an eye-tracking analysis of search strategies during novel word learning in a comparison design. We manipulated both the conceptual distance between the two learning items, i.e., children saw examples which were associated with a noun (e.g., the two learning items were either two bracelets in a "close" comparison condition or a bracelet and a watch in a "far" comparison condition), and the conceptual distance between the learning items and the taxonomically related items in the generalization options (e.g., the taxonomic generalization answer was either a pendant, a near generalization item, or a bow tie, a distant generalization item). We tested 5-, 6- and 8-year-old children's taxonomic (versus perceptual and thematic) generalization of novel names for objects. The search patterns showed that participants first focused on the learning items and then compared them with each of the possible choices. They also spent less time comparing the various options with one another; this search profile remained stable across age groups. Data also revealed that early comparisons (i.e., those reflecting alignment strategies) predicted generalization performance. We discuss four search strategies as well as the effect of age and conceptual distance on these strategies.


Subjects
Eye-Tracking Technology , Vocabulary , Child , Humans , Language , Learning , Generalization, Psychological
3.
PLoS One ; 19(4): e0297068, 2024.
Article in English | MEDLINE | ID: mdl-38593127

ABSTRACT

Compared with visible light images, thermal infrared images have poor resolution, low contrast and signal-to-noise ratio, blurred visual effects, and less information. Thermal infrared sports target detection methods relying on traditional convolutional networks capture the rich semantics in high-level features but blur the spatial details. The differences in physical information content and spatial distribution between high-level and low-level features are ignored, resulting in a mismatch between the region of interest and the target. To address these issues, we propose a local attention-guided Swin-transformer thermal infrared sports object detection method (LAGSwin) to encode sports objects' spatial transformation and orientation information. On the one hand, a Swin-transformer guided by local attention is adopted to enrich the semantic knowledge of low-level features by embedding local focus from high-level features and generating high-quality anchors while increasing the embedding of contextual information. On the other hand, an active rotation filter is employed to encode orientation information, resulting in orientation-sensitive and invariant features to reduce the inconsistency between classification and localization regression. A bidirectional criss-cross fusion strategy is adopted in the feature fusion stage to enable better interaction and embedding of features of different resolutions. Finally, evaluation and verification on multiple open-source sports target datasets show that the proposed LAGSwin detection framework has good robustness and generalization ability.


Subjects
Electric Power Supplies , Physical Examination , Generalization, Psychological , Knowledge , Light
4.
Cogn Sci ; 48(4): e13440, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38606615

ABSTRACT

People implicitly generalize the actions of known individuals in a social group to unknown members. However, actions have social goals and evaluative valences, and the extent to which actions with different valences (helpful and harmful) are implicitly generalized among group members remains unclear. We used computer animations to simulate social group actions, where helping and hindering actions were represented by aiding and obstructing another's climb up a hill. Study 1 found that helpful actions are implicitly expected to be shared among members of the same group but not among members of different groups, but no such effect was found for harmful actions. This suggests that helpful actions are more likely than harmful actions to be implicitly generalized to group members. This finding was replicated in Study 2 by increasing the group size from three to five. Study 3 found that the null effect for generalizing harmful actions among group members is not due to the difficulty of detecting action generalization, as both helpful and harmful actions are similarly generalized within particular individuals. Moreover, Study 4 demonstrated that weakening social group information resulted in the absence of implicit generalization for helpful actions, suggesting the specificity of group membership. Study 5 revealed that the generalization of helping actions occurred when actions were performed by multiple group members rather than being repeated by one group member, showing group-based inductive generalization. Overall, these findings support valence-dependent implicit action generalization among group members. This implies that people may possess different knowledge regarding valenced actions on category-based generalization.


Subjects
Generalization, Psychological , Group Dynamics , Humans
5.
Sci Rep ; 14(1): 8906, 2024 04 17.
Article in English | MEDLINE | ID: mdl-38632252

ABSTRACT

People correct for movement errors when acquiring new motor skills (de novo learning) or adapting well-known movements (motor adaptation). While de novo learning establishes new control policies, adaptation modifies existing ones, and previous work has distinguished behavioral and underlying brain mechanisms for each motor learning type. However, it is still unclear whether learning in each type interferes with the other. In study 1, we use a within-subjects design where participants train with both 30° visuomotor rotation and mirror reversal perturbations, to compare adaptation and de novo learning respectively. We find no perturbation order effects, and find no evidence for differences in learning rates and asymptotes between the two perturbations. Explicit instructions also provide an advantage during early learning in both perturbations. However, mirror reversal learning shows larger inter-participant variability and slower movement initiation. Furthermore, we only observe reach aftereffects following rotation training. In study 2, we incorporate the mirror reversal in a browser-based task, to investigate under-studied de novo learning mechanisms like retention and generalization. Learning persists across three or more days, substantially transfers to the untrained hand, and to targets on both sides of the mirror axis. Our results extend insights for distinguishing motor skill acquisition from adapting well-known movements.


Subjects
Generalization, Psychological , Psychomotor Performance , Humans , Motor Skills , Movement , Reversal Learning , Adaptation, Physiological
6.
PLoS One ; 19(4): e0300502, 2024.
Article in English | MEDLINE | ID: mdl-38635515

ABSTRACT

Fire and smoke detection is crucial for the safe mining of coal energy, but previous fire-smoke detection models did not strike a perfect balance between complexity and accuracy, which makes it difficult to deploy efficient fire-smoke detection in coal mines with limited computational resources. Therefore, we improve the current advanced object detection model YOLOv8s based on two core ideas: (1) we reduce the model computational complexity and ensure real-time detection by applying faster convolutions to the backbone and neck parts; (2) to strengthen the model's detection accuracy, we integrate attention mechanisms into both the backbone and head components. In addition, we improve the model's generalization capacity by augmenting the data. Our method has 23.0% and 26.4% fewer parameters and FLOPs (Floating-Point Operations) than YOLOv8s, which means that we have effectively reduced the computational complexity. Our model also achieves a mAP (mean Average Precision) of 91.0%, which is 2.5% higher than the baseline model. These results show that our method can improve the detection accuracy while reducing complexity, making it more suitable for real-time fire-smoke detection in resource-constrained environments.


Subjects
Algorithms , Smoke , Coal , Generalization, Psychological
7.
PLoS One ; 19(4): e0300473, 2024.
Article in English | MEDLINE | ID: mdl-38635663

ABSTRACT

High-resolution imagery and deep learning models have gained increasing importance in land-use mapping. In recent years, several new deep learning network modeling methods have surfaced. However, there has been a lack of a clear understanding of the performance of these models. In this study, we applied four well-established and robust deep learning models (FCN-8s, SegNet, U-Net, and Swin-UNet) to an open benchmark high-resolution remote sensing dataset to compare their performance in land-use mapping. The results indicate that FCN-8s, SegNet, U-Net, and Swin-UNet achieved overall accuracies of 80.73%, 89.86%, 91.90%, and 96.01%, respectively, on the test set. Furthermore, we assessed the generalization ability of these models using two measures, intersection over union and F1 score, which highlight Swin-UNet's superior robustness compared to the other three models. In summary, our study provides a systematic analysis of the classification differences among these four deep learning models through experiments. It serves as a valuable reference for selecting models in future research, particularly in scenarios such as land-use mapping, urban functional area recognition, and natural resource management.


Subjects
Deep Learning , Remote Sensing Technology , Benchmarking , Generalization, Psychological , Imagery, Psychotherapy
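The two measures named in the abstract above can be computed per class from pixel labels. The sketch below uses their standard definitions on flattened label lists; the toy labels and the function name are hypothetical, and the abstract does not state how the study aggregated scores across classes.

```python
def iou_and_f1(predicted, reference, target_class):
    """Per-class intersection over union and F1 score from flat label lists."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == target_class and r == target_class)
    fp = sum(1 for p, r in pairs if p == target_class and r != target_class)
    fn = sum(1 for p, r in pairs if p != target_class and r == target_class)
    union = tp + fp + fn
    iou = tp / union if union else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 0.0
    return iou, f1

# Toy example: six pixels, class 1 is the land-use class of interest.
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1]
iou, f1 = iou_and_f1(predicted, reference, target_class=1)
```

Both measures ignore true negatives, so they are less inflated than overall accuracy when one class dominates the image.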
8.
Sci Rep ; 14(1): 5695, 2024 03 08.
Article in English | MEDLINE | ID: mdl-38459104

ABSTRACT

The successful integration of neural networks in a clinical setting is still uncommon despite major successes achieved by artificial intelligence in other domains. This is mainly due to the black box characteristic of most optimized models and the undetermined generalization ability of the trained architectures. The current work tackles both issues in the radiology domain by focusing on developing an effective and interpretable cardiomegaly detection architecture based on segmentation models. The architecture consists of two distinct neural networks performing the segmentation of both cardiac and thoracic areas of a radiograph. The respective segmentation outputs are subsequently used to estimate the cardiothoracic ratio, and the corresponding radiograph is classified as a case of cardiomegaly based on a given threshold. Due to the scarcity of pixel-level labeled chest radiographs, both segmentation models are optimized in a semi-supervised manner. This results in a significant reduction in the costs of manual annotation. The resulting segmentation outputs significantly improve the interpretability of the architecture's final classification results. The generalization ability of the architecture is assessed in a cross-domain setting. The assessment shows the effectiveness of the semi-supervised optimization of the segmentation models and the robustness of the ensuing classification architecture.


Subjects
Artificial Intelligence , Cardiomegaly , Humans , Cardiomegaly/diagnostic imaging , Generalization, Psychological , Heart , Image Processing, Computer-Assisted , Neural Networks, Computer
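The final classification step described in the abstract above (segmentation outputs, then cardiothoracic ratio, then a threshold) might look roughly like the sketch below. It assumes binary masks stored as lists of rows and measures each structure's maximal horizontal extent; the 0.5 threshold is a commonly quoted clinical cut-off used here as an assumption, since the abstract only says "a given threshold".

```python
def horizontal_extent(mask):
    """Width in pixels spanned by the non-zero columns of a binary mask."""
    columns = [c for row in mask for c, value in enumerate(row) if value]
    return max(columns) - min(columns) + 1 if columns else 0

def cardiothoracic_ratio(heart_mask, thorax_mask):
    """Ratio of maximal cardiac width to maximal thoracic width."""
    thorax_width = horizontal_extent(thorax_mask)
    if thorax_width == 0:
        raise ValueError("thorax mask is empty")
    return horizontal_extent(heart_mask) / thorax_width

def is_cardiomegaly(heart_mask, thorax_mask, threshold=0.5):
    """Flag a radiograph when the ratio exceeds the (assumed) threshold."""
    return cardiothoracic_ratio(heart_mask, thorax_mask) > threshold

# Toy 3 x 10 masks standing in for the two segmentation outputs.
thorax = [[1] * 10 for _ in range(3)]
heart = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
]
ratio = cardiothoracic_ratio(heart, thorax)
```

Because the decision is derived from two visible masks and one ratio, a reader can inspect each intermediate result, which is the interpretability argument the abstract makes.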
9.
Neural Netw ; 174: 106129, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38508044

ABSTRACT

Multi-task multi-agent systems (MASs) are challenging to model because they involve heterogeneous agents with different behavior patterns that need to cooperate across various tasks. Existing networks for single-agent policies are not suitable for this setting, as they cannot share policies among agents without losing task-specific performance. We propose a novel framework called Role-based Multi-Agent Transformer (RoMAT), which uses a sequence modeling technique and a role-based actor to enable agents to adapt to different tasks and roles in MASs. RoMAT has a modular model architecture, where backbone networks are shared by all agents, but a small part of the parameters (role-based actor) is independent, depending on the agents' exclusive structures. We evaluate RoMAT on several benchmark tasks and show that it can capture the behavior patterns of heterogeneous agents and achieve better performance and generalization than other methods in both single and multi-task settings.


Subjects
Benchmarking , Generalization, Psychological , Policy
10.
Neural Netw ; 174: 106258, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38555722

ABSTRACT

Cropping-and-segmenting pattern parsers often combine diverse inner correlations into a single metric/scheme, resulting in over-generalizations and redundant representations. It is proposed to streamline pattern parsing by presenting a redundant association elimination network (RAEN) with capsule attention twisters (CATs) and capsule-attention routing agreement (CARA). CATs trim delicate relationships between parts and wholes that are weak and interchangeable. Senior entities can only be updated by primary entities that meet the requirements of inter-part diversity and intra-object cohesiveness. In order to enhance results, CARA is designed to protect against the unnecessary voting signals of traditional routing protocols. Experiments involving facial and human segmentation show that RAEN is better than current remarkable methods, particularly for defining detailed semantic boundaries.


Subjects
Face , Generalization, Psychological , Humans , Semantics , Software
11.
Neural Netw ; 174: 106219, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38442489

ABSTRACT

Extrapolating future events based on historical information in temporal knowledge graphs (TKGs) holds significant research value and practical applications. In this field, the methods currently utilized can be classified as either embedding-based or logical rule-based. Embedding-based methods depend on learned entity and relation embeddings for prediction, but they suffer from the lack of interpretability due to the opaque reasoning process. On the other hand, logical rule-based methods face scalability challenges as they heavily rely on predefined logical rules. To overcome these limitations, we propose a hybrid model that combines embedding-based and logical rule-based methods to capture deep causal logic. Our model, called the Inductive Reasoning Model based on Interpretable Logical Rule (ILR-IR), aims to provide interpretable insights while effectively predicting future events in TKGs. ILR-IR delves into historical information, extracting valuable insights from logical rules embedded within relations and interaction preferences between entities. By considering both logical rules and interaction preferences, ILR-IR offers a comprehensive perspective for predicting future events. In addition, we propose the incorporation of a one-class augmented matching loss during optimization, which serves to enhance performance of the model during training. We evaluate ILR-IR on multiple datasets, including ICEWS14, ICEWS0515, and ICEWS18. Experimental results demonstrate that ILR-IR outperforms state-of-the-art baselines, showcasing its superior performance in TKG extrapolation reasoning. Moreover, ILR-IR demonstrates remarkable generalization capabilities, even when applied to related datasets that share a common relation vocabulary. This suggests that our proposed model exhibits robust zero-shot reasoning abilities. For interested parties, we have made our code publicly available at https://github.com/mxadorable/ILR-IR.


Subjects
Pattern Recognition, Automated , Problem Solving , Learning , Generalization, Psychological , Knowledge
12.
Neural Netw ; 174: 106224, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38479186

ABSTRACT

Adversarial training has become the mainstream method to boost the adversarial robustness of deep models. However, it often suffers from the trade-off dilemma, where the use of adversarial examples hurts the standard generalization of models on natural data. To study this phenomenon, we investigate it from the perspective of spatial attention. In brief, standard training typically encourages a model to conduct a comprehensive check of the input space. But adversarial training often causes a model to overly concentrate on sparse spatial regions. This reduced tendency is beneficial to avoid adversarial accumulation but easily makes the model ignore abundant discriminative information, thereby resulting in weak generalization. To address this issue, this paper introduces an Attention-Enhanced Learning Framework (AELF) for robustness training. The main idea is to enable the model to inherit the attention pattern of a standard pre-trained model through an embedding-level regularization. To be specific, given a teacher model built on natural examples, the embedding distribution of the teacher model is used as a static constraint to regulate the embedding outputs of the objective model. This design is mainly supported by the fact that the embedding feature of a standard model is usually recognized as a rich semantic integration of the input. For implementation, we present a simplified AELF that can achieve the regularization with a single cross-entropy loss via parameter initialization and a parameter update strategy. This avoids the extra consistency comparison operation between embedding vectors. Experimental observations verify the rationality of our argument, and experimental results demonstrate that it can achieve remarkable improvements in generalization under high-level robustness.


Subjects
Generalization, Psychological , Learning , Entropy , Semantics
13.
Neuropsychologia ; 196: 108848, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38432323

ABSTRACT

This study aimed to investigate whether neurological patients presenting with a bias in line bisection show specific problems in bisecting a line into two equal parts or their line bisection bias rather reflects a special case of a deficit in proportional reasoning more generally. In the latter case, the bias should also be observed for segmentations into thirds or quarters. To address this question, six neglect patients with a line bisection bias were administered additional tasks involving horizontal lines (e.g., segmentation into thirds and quarters, number line estimation, etc.). Their performance was compared to five neglect patients without a line bisection bias, 10 patients with right hemispheric lesions without neglect, and 32 healthy controls. Most interestingly, results indicated that neglect patients with a line bisection bias also overestimated segments on the left of the line (e.g., one third, one quarter) when dissecting lines into parts smaller than halves. In contrast, such segmentation biases were more nuanced when the required line segmentation was framed as a number line estimation task with either fractions or whole numbers. Taken together, this suggests a generalization of line bisection bias towards a segmentation or proportional processing bias, which is congruent with attentional weighting accounts of line bisection/neglect. As such, patients with a line bisection bias do not seem to have specific problems bisecting a line, but seem to suffer from a more general deficit processing proportions.


Subjects
Functional Laterality , Perceptual Disorders , Humans , Perceptual Disorders/etiology , Attention , Bias , Generalization, Psychological , Space Perception
14.
PLoS One ; 19(3): e0293440, 2024.
Article in English | MEDLINE | ID: mdl-38512838

ABSTRACT

Recent work has suggested that feedforward residual neural networks (ResNets) approximate iterative recurrent computations. Iterative computations are useful in many domains, so they might provide good solutions for neural networks to learn. However, principled methods for measuring and manipulating iterative convergence in neural networks remain lacking. Here we address this gap by 1) quantifying the degree to which ResNets learn iterative solutions and 2) introducing a regularization approach that encourages the learning of iterative solutions. Iterative methods are characterized by two properties: iteration and convergence. To quantify these properties, we define three indices of iterative convergence. Consistent with previous work, we show that, even though ResNets can express iterative solutions, they do not learn them when trained conventionally on computer-vision tasks. We then introduce regularizations to encourage iterative convergent computation and test whether this provides a useful inductive bias. To make the networks more iterative, we manipulate the degree of weight sharing across layers using soft gradient coupling. This new method provides a form of recurrence regularization and can interpolate smoothly between an ordinary ResNet and a "recurrent" ResNet (i.e., one that uses identical weights across layers and thus could be physically implemented with a recurrent network computing the successive stages iteratively across time). To make the networks more convergent we impose a Lipschitz constraint on the residual functions using spectral normalization. The three indices of iterative convergence reveal that the gradient coupling and the Lipschitz constraint succeed at making the networks iterative and convergent, respectively. 
To showcase the practicality of our approach, we study how iterative convergence impacts generalization on standard visual recognition tasks (MNIST, CIFAR-10, CIFAR-100) or challenging recognition tasks with partial occlusions (Digitclutter). We find that iterative convergent computation, in these tasks, does not provide a useful inductive bias for ResNets. Importantly, our approach may be useful for investigating other network architectures and tasks as well and we hope that our study provides a useful starting point for investigating the broader question of whether iterative convergence can help neural networks in their generalization.


Subjects
Learning , Neural Networks, Computer , Generalization, Psychological
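The abstract above mentions imposing a Lipschitz constraint on residual functions through spectral normalization. A minimal sketch of that idea for a single linear map follows: power iteration estimates the largest singular value of a weight matrix, and the matrix is rescaled when that value exceeds a bound. The bound of 1.0, the iteration count, and the rescale-only-if-needed variant are assumptions for illustration, not the paper's implementation.

```python
import math
import random

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def norm(vector):
    return math.sqrt(sum(x * x for x in vector))

def spectral_norm(weights, iterations=50, seed=0):
    """Estimate the largest singular value of `weights` by power iteration."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in weights[0]]
    weights_t = transpose(weights)
    for _ in range(iterations):
        v = matvec(weights_t, matvec(weights, v))  # one step on W^T W
        length = norm(v)
        if length == 0.0:
            return 0.0
        v = [x / length for x in v]
    return norm(matvec(weights, v))

def constrain_lipschitz(weights, bound=1.0):
    """Rescale `weights` so its spectral norm does not exceed `bound`."""
    sigma = spectral_norm(weights)
    if sigma <= bound:
        return [row[:] for row in weights]
    scale = bound / sigma
    return [[w * scale for w in row] for row in weights]

layer = [[3.0, 0.0], [0.0, 1.0]]
constrained = constrain_lipschitz(layer)
```

A linear map with spectral norm at most 1 cannot expand distances, which is why bounding it is one route to making repeated application of a residual function convergent.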
15.
PLoS One ; 19(3): e0299902, 2024.
Article in English | MEDLINE | ID: mdl-38512917

ABSTRACT

Accurate identification of small tea buds is a key technology for tea harvesting robots, which directly affects tea quality and yield. However, due to the complexity of the tea plantation environment and the diversity of tea buds, accurate identification remains an enormous challenge. Current methods based on traditional image processing and machine learning fail to effectively extract subtle features and morphology of small tea buds, resulting in low accuracy and robustness. To achieve accurate identification, this paper proposes a small object detection algorithm called STF-YOLO (Small Target Detection with Swin Transformer and Focused YOLO), which integrates the Swin Transformer module and the YOLOv8 network to improve the detection ability of small objects. The Swin Transformer module extracts visual features based on a self-attention mechanism, which captures global and local context information of small objects to enhance feature representation. The YOLOv8 network is an object detector based on deep convolutional neural networks, offering high speed and precision. Based on the YOLOv8 network, modules including Focus and Depthwise Convolution are introduced to reduce computation and parameters, increase receptive field and feature channels, and improve feature fusion and transmission. Additionally, the Wise Intersection over Union loss is utilized to optimize the network. Experiments conducted on a self-created dataset of tea buds demonstrate that the STF-YOLO model achieves outstanding results, with an accuracy of 91.5% and a mean Average Precision of 89.4%. These results are significantly better than other detectors. Results show that, compared to mainstream algorithms (YOLOv8, YOLOv7, YOLOv5, and YOLOx), the model improves accuracy and F1 score by 5-20.22 percentage points and 0.03-0.13, respectively, proving its effectiveness in enhancing small object detection performance. 
This research provides technical means for the accurate identification of small tea buds in complex environments and offers insights into small object detection. Future research can further optimize model structures and parameters for more scenarios and tasks, as well as explore data augmentation and model fusion methods to improve generalization ability and robustness.


Subjects
Algorithms , Neural Networks, Computer , Electric Power Supplies , Generalization, Psychological , Tea
16.
Neural Netw ; 174: 106226, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38490117

ABSTRACT

Convolutional neural networks (CNNs) have gained immense popularity in recent years, finding their utility in diverse fields such as image recognition, natural language processing, and bio-informatics. Despite the remarkable progress made in deep learning theory, most studies on CNNs, especially in regression tasks, tend to heavily rely on the least squares loss function. However, there are situations where such learning algorithms may not suffice, particularly in the presence of heavy-tailed noise or outliers. This predicament emphasizes the necessity of exploring alternative loss functions that can handle such scenarios more effectively, thereby unleashing the true potential of CNNs. In this paper, we investigate the generalization error of deep CNNs with the rectified linear unit (ReLU) activation function for robust regression problems within an information-theoretic learning framework. Our study demonstrates that when the regression function exhibits an additive ridge structure and the noise possesses a finite pth moment, the empirical risk minimization scheme, generated by the maximum correntropy criterion and deep CNNs, achieves fast convergence rates. Notably, these rates align with the minimax optimal convergence rates attained by fully connected neural network models with the Huber loss function up to a logarithmic factor. Additionally, we further establish the convergence rates of deep CNNs under the maximum correntropy criterion when the regression function resides in a Sobolev space on the sphere.


Subjects
Algorithms , Neural Networks, Computer , Natural Language Processing , Computational Biology , Generalization, Psychological
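To make the contrast between the least squares loss and the maximum correntropy criterion in the abstract above concrete, the sketch below compares the two on a residual list containing one outlier. It assumes one commonly used form of the correntropy-induced loss, sigma^2 * (1 - exp(-r^2 / sigma^2)); the scale parameter sigma = 1 and the residual values are illustrative assumptions, and the paper may use a different normalisation.

```python
import math

def correntropy_loss(residual, sigma=1.0):
    """Correntropy-induced loss: close to r^2 for small r, bounded by sigma^2."""
    return sigma ** 2 * (1.0 - math.exp(-(residual ** 2) / sigma ** 2))

def squared_loss(residual):
    """Least squares loss: grows without bound as the residual grows."""
    return residual ** 2

def mean_loss(loss, residuals):
    return sum(loss(r) for r in residuals) / len(residuals)

# Three small residuals and one gross outlier (hypothetical values).
residuals = [0.1, -0.2, 0.05, 10.0]
mean_squared = mean_loss(squared_loss, residuals)
mean_correntropy = mean_loss(correntropy_loss, residuals)
```

The outlier contributes 100 to the squared loss but at most sigma^2 to the correntropy loss, so it cannot dominate the empirical risk; this boundedness is the source of the robustness to heavy-tailed noise.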
17.
PLoS One ; 19(3): e0299471, 2024.
Article in English | MEDLINE | ID: mdl-38451909

ABSTRACT

Structural planes decrease the strength and stability of rock masses, severely affecting their mechanical properties and deformation and failure characteristics. Therefore, investigation and analysis of structural planes are crucial tasks in mining rock mechanics. The drilling camera obtains image information of deep structural planes of rock masses through high-definition camera methods, providing important data sources for the analysis of deep structural planes of rock masses. This paper addresses the problems of high workload, low efficiency, high subjectivity, and poor accuracy brought about by manual processing based on current borehole image analysis and conducts an intelligent segmentation study of borehole image structural planes based on the U2-Net network. By collecting data from 20 different borehole images in different lithological regions, a dataset consisting of 1,013 borehole images with structural plane type, lithology, and color was established. Data augmentation methods such as image flipping, color jittering, blurring, and mixup were applied to expand the dataset to 12,421 images, meeting the requirements for deep network training data. Based on the PyTorch deep learning framework, the initial U2-Net network weights were set, the learning rate was set to 0.001, the training batch was 4, and the Adam optimizer adaptively adjusted the learning rate during the training process. A dedicated network model for segmenting structural planes was obtained, and the model achieved a maximum F-measure value of 0.749 when the confidence threshold was set to 0.7, with an accuracy rate of up to 0.85 within the range of recall rate greater than 0.5. Overall, the model has high accuracy for segmenting structural planes and very low mean absolute error, indicating good segmentation accuracy and certain generalization of the network. 
The research method in this paper can serve as a reference for the study of intelligent identification of structural planes in borehole images.


Subjects
Mental Recall , Recognition, Psychology , Compulsive Behavior , Generalization, Psychological , Image Processing, Computer-Assisted
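Among the augmentation methods listed in the abstract above, mixup is the least self-explanatory. The sketch below blends two images and their segmentation masks with a weight drawn from a Beta distribution, which is the usual formulation of mixup. The alpha value of 0.4, the 2 x 2 toy arrays, and the choice to blend masks as soft labels are assumptions; the abstract does not describe how the authors configured it.

```python
import random

def blend(first, second, weight):
    """Pixel-wise convex combination of two equally sized 2D arrays."""
    return [[weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]

def mixup(image_a, image_b, mask_a, mask_b, alpha=0.4, rng=None):
    """Return a blended image, a blended (soft) mask, and the mixing weight."""
    rng = rng or random.Random()
    weight = rng.betavariate(alpha, alpha)
    return blend(image_a, image_b, weight), blend(mask_a, mask_b, weight), weight

# Toy grey-level images and binary structural-plane masks.
image_a = [[1.0, 1.0], [0.0, 0.0]]
image_b = [[0.0, 0.5], [0.5, 1.0]]
mask_a = [[1, 1], [0, 0]]
mask_b = [[0, 0], [1, 1]]
mixed_image, mixed_mask, weight = mixup(image_a, image_b, mask_a, mask_b,
                                        rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the augmentation reproducible, which helps when comparing training runs.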
18.
Sci Rep ; 14(1): 5644, 2024 03 07.
Article in English | MEDLINE | ID: mdl-38453977

ABSTRACT

Visual perceptual learning is traditionally thought to arise in visual cortex. However, typical perceptual learning tasks also involve systematic mapping of visual information onto motor actions. Because the motor system contains both effector-specific and effector-unspecific representations, the question arises whether visual perceptual learning is effector-specific itself, or not. Here, we study this question in an orientation discrimination task. Subjects learn to indicate their choices either with joystick movements or with manual reaches. After training, we challenge them to perform the same task with eye movements. We dissect the decision-making process using the drift diffusion model. We find that learning effects on the rate of evidence accumulation depend on effectors, albeit not fully. This suggests that during perceptual learning, visual information is mapped onto effector-specific integrators. Overlap of the populations of neurons encoding motor plans for these effectors may explain partial generalization. Taken together, visual perceptual learning is not limited to visual cortex, but also affects sensorimotor mapping at the interface of visual processing and decision making.


Subjects
Visual Cortex , Visual Perception , Humans , Visual Perception/physiology , Eye Movements , Visual Cortex/physiology , Spatial Learning , Generalization, Psychological
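The drift diffusion model used in the abstract above treats a decision as noisy evidence accumulating until it reaches one of two bounds, with the drift rate playing the role of the "rate of evidence accumulation". A minimal simulation sketch follows; the symmetric bounds at +1 and -1, unit noise, 10 ms time step, and the two drift values are illustrative assumptions rather than parameters fitted in the study.

```python
import math
import random

def simulate_trial(drift, bound=1.0, noise=1.0, dt=0.01, max_time=10.0, rng=random):
    """Accumulate evidence until a bound is hit; return (correct, decision_time)."""
    evidence, elapsed = 0.0, 0.0
    while abs(evidence) < bound and elapsed < max_time:
        evidence += drift * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        elapsed += dt
    # Reaching the upper bound counts as the correct choice; timeouts count as errors.
    return evidence >= bound, elapsed

def accuracy(drift, trials=300, seed=0):
    """Proportion of simulated trials that end at the correct bound."""
    rng = random.Random(seed)
    return sum(simulate_trial(drift, rng=rng)[0] for _ in range(trials)) / trials

# A higher drift rate stands in for the state after perceptual learning.
accuracy_before = accuracy(drift=0.2)
accuracy_after = accuracy(drift=2.0)
```

In this framing, effector-specific learning would appear as a drift-rate gain that shows up for the trained response modality and only partly for an untrained one.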
19.
Neural Netw ; 172: 106154, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38309137

ABSTRACT

Herein, we propose a novel dataset distillation method for constructing small informative datasets that preserve the information of the large original datasets. The development of deep learning models is enabled by the availability of large-scale datasets. Despite unprecedented success, large-scale datasets considerably increase the storage and transmission costs, resulting in a cumbersome model training process. Moreover, using raw data for training raises privacy and copyright concerns. To address these issues, a new task named dataset distillation has been introduced, aiming to synthesize a compact dataset that retains the essential information from the large original dataset. State-of-the-art (SOTA) dataset distillation methods have been proposed by matching gradients or network parameters obtained during training on real and synthetic datasets. The contribution of different network parameters to the distillation process varies, and uniformly treating them leads to degraded distillation performance. Based on this observation, we propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance by automatically assigning importance weights to different network parameters during distillation, thereby synthesizing more robust distilled datasets. IADD demonstrates superior performance over other SOTA dataset distillation methods based on parameter matching on multiple benchmark datasets and outperforms them in terms of cross-architecture generalization. In addition, the analysis of self-adaptive weights demonstrates the effectiveness of IADD. Furthermore, the effectiveness of IADD is validated in a real-world medical application such as COVID-19 detection.


Subjects
COVID-19 , Distillation , Humans , Benchmarking , Generalization, Psychological , Privacy
20.
Neural Netw ; 172: 106125, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38320348

ABSTRACT

Graph Contrastive Learning (GCL) is increasingly employed in graph representation learning with the primary aim of learning node/graph representations from a predefined pretext task that can generalize to various downstream tasks. Meanwhile, the transition from a specific pretext task to diverse and unpredictable downstream tasks poses a significant challenge for GCL's generalization ability. Most existing GCL approaches maximize mutual information between two views derived from the original graph, either randomly or heuristically. However, the generalization ability of GCL and its theoretical principles are still less studied. In this paper, we introduce a novel metric GCL-GE, to quantify the generalization gap between predefined pretext and agnostic downstream tasks. Given the inherent intractability of GCL-GE, we leverage concepts from information theory to derive a mutual information upper bound that is independent of the downstream tasks, thus enabling the metric's optimization despite the variability in downstream tasks. Based on the theoretical insight, we propose InfoAdv, a GCL framework to directly enhance generalization by jointly optimizing GCL-GE and InfoMax. Extensive experiments validate the capability of InfoAdv to enhance performance across a wide variety of downstream tasks, demonstrating its effectiveness in improving the generalizability of GCL.


Subjects
Information Theory , Learning , Generalization, Psychological